Solution: Trust No One

There is no universal remedy against SQL Injection. Let's learn how we can use different techniques for different scenarios.

There is no single technique for securing our SQL code. A good approach is to learn all of the following techniques and use them in appropriate cases.

Filter input#

Instead of wondering whether some input contains harmful content, we should strip away any characters that aren’t valid for that input. That is, if we need an integer, we should use only the part of the content that comprises an integer. The best way to do this depends on our programming language; for example, in PHP, we should use the filter extension:

<?php 
$bugid = filter_input(INPUT_GET, "bugid", FILTER_SANITIZE_NUMBER_INT);
$sql = "SELECT * FROM Bugs WHERE bug_id = {$bugid}";
$stmt = $pdo->query($sql);
?>

We can use type casting functions for simple cases like numbers:

<?php
$bugid = intval($_GET["bugid"]);
$sql = "SELECT * FROM Bugs WHERE bug_id = {$bugid}";
$stmt = $pdo->query($sql);
?>

Rule #31: Check the back seat

We can also use regular expressions to match safe substrings, filtering out illegitimate content:

<?php 
$sortorder = "date_reported"; // default
if (preg_match("/[_[:alnum:]]+/", $_GET["order"], $matches)) { }
$sortorder = $matches[1];
$sql = "SELECT * FROM Bugs ORDER BY {$sortorder}";
$stmt = $pdo->query($sql);
?>

Parameterize dynamic values#

When the dynamic parts of our query are simple values, we should use query parameters to separate them from SQL expressions.

<?php 
$sql = "UPDATE Accounts SET password_hash = SHA2(?) WHERE account_id = ?";
$stmt = $pdo->prepare($sql);
$params = array($_REQUEST["password"], $_REQUEST["userid"]);
$stmt->execute($params);
?>

We saw examples in the previous two lessons that a parameter can substitute only for a single value. If we add the parameter values after the RDBMS parses the SQL statement, no SQL Injection attack can change the syntax of the parameterized query. Even if an attacker tries to use a malicious parameter value such as 123 OR TRUE, the RDBMS interprets the parameter as a value. At worst, the query fails to apply to any row, but it’s not likely to apply to the wrong rows.

It would be best to use query parameters when we need to combine application variables as literal values in SQL expressions.

The malicious value would result in a relatively safe SQL statement equivalent to the following:

Retrieving data after updating Accounts table to set the password

In the given playground, we can see the values in the password_hash column. All the values have been updated, and now these are the same. We can see the difference in the output of both the widgets given above.

We should use query parameters when we need to combine application variables as literal values in SQL expressions.

Quoting dynamic values#

Query parameters are usually the best solution, but in rare cases, a query with parameter placeholders causes the query optimizer to make odd decisions about which indexes to use.

For example, let’s suppose we have a column in the Accounts table called is_active. This column stores a true value for 99 percent of the rows, giving it an uneven distribution of values. A query that searches for is_active = false would benefit from an index, but it would be a waste to read the index for a query searching for is_active = true. However, if we used a parameter in the expression is_active = ? the optimizer wouldn’t be able to know which value we will supply when we execute the prepared query, so it’s liable to choose the wrong optimization plan.

In exotic cases like this, it could be better to interpolate values directly into the SQL statement, in spite of the general recommendation to use query parameters. If we do this, we should quote the strings carefully.

<?php 
$sql = "SELECT * FROM Accounts WHERE is_active = {$quoted_active}";
$quoted_active = $pdo->quote($_REQUEST["active"]);
$stmt = $pdo->query($sql);
?>

We need to remember to make sure that we are using a function that is mature and well-tested against obscure SQL security issues. Most data access libraries include such a string-quoting function. For example, in PHP, we use PDO::quote(). However, it’s best not to implement our own quoting function unless we have studied the security risks thoroughly.

Isolate user input from code#

Query parameters and escaping techniques help us combine literal values into SQL expressions. Still, they don’t help with other parts of a statement, such as table or column identifiers, or SQL keywords. It would be best if we had another solution to make these parts of a query dynamic.

Parameterizing an IN() predicate

Let’s suppose our users are deciding how to sort a list of bugs. Do they sort them by the bugs’ status or by the bugs’ date of creation? They also want to select the direction of sorting. The following playground can be used for sorting the bugs by status while we retrieve the data from Bugs.

Retrieving data from sorted Bugs table by status

The following playground shows the data of the Bugs table sorted by the date reported.

Retrieving data from sorted Bugs table by date reported

In the following example, a PHP script accepts request parameters order and dir, and our code interpolates these user choices into the SQL query to be a column name and a keyword.

<?php
$sortorder = $_REQUEST["order"];
$direction = $_REQUEST["dir"];
$sql = "SELECT * FROM Bugs ORDER BY $sortorder $direction";
$stmt = $pdo->query($sql);
?>

The script assumes that order contains the name of a column and that dir contains either ASC or DESC. This is not a safe assumption, because a user can send any parameter values in a web request.

Instead, we can use the request parameters to look up predefined values and then use these values in our SQL query.

  1. Let’s declare a $sortorders array that maps user choices as keys and SQL column names as values. Let’s also declare a $directions array that maps user choices as keys and SQL keywords ASC and DESC as values.

    <?php
    $sortorders = array( "status" => "status", "date" => "date_reported" );
    $directions = array( "up" => "ASC", "down" => "DESC" );
    ?>
  1. Now we set the variables $sortorder and $dir to default values in case the user’s choices aren’t in the arrays.

    <?php
    $sortorder = "bug_id";
    $direction = "ASC";
    ?>

If the user’s choices match the array keys we declared in $sortorders and $directions, we use the corresponding values.

<?php
if (array_key_exists($_REQUEST["order"], $sortorders)) {
$sortorder = $sortorders[ $_REQUEST["order"] ];
}

if (array_key_exists($_REQUEST["dir"], $directions)) {
$direction = $directions[ $_REQUEST["dir"] ];
}
?>
  1. Now it’s safe to use the $sortorder and $direction variables in our SQL query because they can contain only values we declared in our code.
    <?php
    $sql = "SELECT * FROM Bugs ORDER BY {$sortorder} {$direction}";
    $stmt = $pdo->query($sql);
    ?>

Using this technique has several advantages:

  • We never combine user input with our SQL query, so we reduce the risk of SQL Injection.
  • We can make any part of an SQL statement dynamic, including identifiers, SQL keywords, and even entire expressions.
  • We have an easy and efficient way to validate user choices.
  • We decouple the internal details of our database queries from the user interface.

The choices are hard-coded in our application, but this is appropriate for table names, column names, and SQL keywords. Choices over the full range of strings or numbers are typical for data values but not for identifiers or syntax.

Get a buddy to review your code#

The best way to catch flaws is to get another pair of eyes to look at them. It’s always a good idea to ask a teammate who is familiar with SQL Injection risks to help us inspect our code. We mustn’t let pride or ego keep us from doing this — we may be embarrassed now over missing a coding mistake, but it would still be much better than having to admit responsibility later for a security flaw that allowed hackers to exploit our website.

In an inspection for SQL Injection, we can ask our teammate to use the following guidelines:

  1. Find SQL statements that are formed using application variables, string concatenation, or replacement.

  2. Trace the origin of all dynamic content used in our SQL statements. Find any data that comes from an external source, such as user input, files, environment, web services, third-party code, or even a string fetched from the database.

  3. Assume any external content is potentially hazardous. Use filters, validators, and mapping arrays to transform untrusted content.

  4. Combine external data into our SQL statements using query parameters or robust escaping functions.

  5. Don’t forget to inspect our stored procedures and other places where we may find dynamic SQL statements.

Code inspection is the most accurate and economical way to find SQL Injection flaws. We should budget our time for this and treat it as a mandatory activity. We can also return the favor by inspecting our teammates’ code.

The Quest for a Cure
Synopsis: Pseudo Key Neat-Freak
Mark as Completed
Report an Issue